Dense RGB-D semantic mapping with Pixel-Voxel neural network
نویسندگان
چکیده
For intelligent robotics applications, extending 3D mapping to 3D semantic mapping enables robots to, not only localize themselves with respect to the scene’s geometrical features but also simultaneously understand the higher level meaning of the scene contexts. Most previous methods focus on geometric 3D reconstruction and scene understanding independently notwithstanding the fact that joint estimation can boost the accuracy of the semantic mapping. In this paper, a dense RGB-D semantic mapping system with a Pixel-Voxel network is proposed, which can perform dense 3D mapping while simultaneously recognizing and semantically labelling each point in the 3D map. The proposed Pixel-Voxel network obtains global context information by using PixelNet to exploit the RGB image and meanwhile, preserves accurate local shape information by using VoxelNet to exploit the corresponding 3D point cloud. Unlike the existing architecture that fuses score maps from different models with equal weights, we proposed a Softmax weighted fusion stack that adaptively learns the varying contributions of PixelNet and VoxelNet, and fuses the score maps of the two models according to their respective confidence levels. The proposed Pixel-Voxel network achieves the state-of-the-art semantic segmentation performance on the SUN RGB-D benchmark dataset. The runtime of the proposed system can be boosted to 11-12Hz, enabling near to real-time performance using an i7 8-cores PC with Titan X GPU.
منابع مشابه
RGB-D mapping: Using Kinect-style depth cameras for dense 3D modeling of indoor environments
RGB-D cameras (such as the Microsoft Kinect) are novel sensing systems that capture RGB images along with per-pixel depth information. In this paper we investigate how such cameras can be used for building dense 3D maps of indoor environments. Such maps have applications in robot navigation, manipulation, semantic mapping, and telepresence. We present RGB-D Mapping, a full 3D mapping system tha...
متن کاملDA-RNN: Semantic Mapping with Data Associated Recurrent Neural Networks
3D scene understanding is important for robots to interact with the 3D world in a meaningful way. Most previous works on 3D scene understanding focus on recognizing geometrical or semantic properties of a scene independently. In this work, we introduce Data Associated Recurrent Neural Networks (DA-RNNs), a novel framework for joint 3D scene mapping and semantic labeling. DA-RNNs use a new recur...
متن کاملVolume-Based Semantic Labeling with Signed Distance Functions
Research works on the two topics of Semantic Segmentation and SLAM (Simultaneous Localization and Mapping) have been following separate tracks. Here, we link them quite tightly by delineating a category label fusion technique that allows for embedding semantic information into the dense map created by a volume-based SLAM algorithm such as KinectFusion. Accordingly, our approach is the first to ...
متن کاملRGB-D Mapping: Using Depth Cameras for Dense 3D Modeling of Indoor Environments
RGB-D cameras are novel sensing systems that capture RGB images along with per-pixel depth information. RGB-D cameras rely on either structured light patterns combined with stereo sensing [6,10] or time-of-flight laser sensing [1] to generate depth estimates that can be associated with RGB pixels. Very soon, small, high-quality RGB-D cameras developed for computer gaming and home entertainment ...
متن کاملFast Semantic Segmentation of RGB-D Scenes with GPU-Accelerated Deep Neural Networks
In semantic scene segmentation, every pixel of an image is assigned a category label. This task can be made easier by incorporating depth information, which structured light sensors provide. Depth, however, has very di erent properties from RGB image channels. In this paper, we present a novel method to provide depth information to convolutional neural networks. For this purpose, we apply a sim...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- CoRR
دوره abs/1710.00132 شماره
صفحات -
تاریخ انتشار 2017